Summary

In this work we present a simple estimation procedure for a general frailty model for 
analysis of prospective correlated failure times. Earlier work showed this method to 
perform well in a simulation study. Here we provide rigorous large-sample theory for 
the proposed estimators of both the regression coefficient vector and the dependence 
parameter, including consistent variance estimators.

1 Introduction



Many epidemiological studies involve failure times that are clustered into groups, such as 
families or schools. In this setting, unobserved characteristics shared by the members of 
the same cluster (e.g. genetic information or unmeasured shared environmental exposures) 
could influence time to the studied event. Frailty models express within cluster depen- 
dence through a shared unobservable random effect. Estimation in the frailty model has 
received much attention under various frailty distributions, including gamma (Gill, 1985, 
1989; Nielsen et al., 1992; Klein, 1992, among others), positive stable (Hougaard, 1986;
Fine et al., 2003), inverse Gaussian, compound Poisson (Henderson and Oman, 1999) 
and log-normal (McGilchrist, 1993; Ripatti and Palmgren, 2000; Vaida and Xu, 2000, 
among others). Hougaard (2000) provides a comprehensive review of the properties of 
the various frailty distributions. In a frailty model, the parameters of interest typically are 
the regression coefficients, the cumulative baseline hazard function, and the dependence 
parameters in the random effect distribution. 

Since the frailties are latent covariates, the Expectation-Maximization (EM) algorithm 
is a natural estimation tool, with the latent covariates estimated in the E-step and the 
likelihood maximized in the M-step by substituting the estimated latent quantities. Gill 
(1985), Nielsen et al. (1992) and Klein (1992) discussed EM-based maximum likelihood
estimation for the semiparametric gamma frailty model. One problem with the EM
algorithm is that variance estimates for the estimated parameters are not readily available
(Louis, 1982; Gill, 1989; Nielsen et al., 1992; Andersen et al., 1997). It was suggested
(Gill, 1989; Nielsen et al., 1992) that a nonparametric information calculation could yield
consistent variance estimators. Parner (1998), building on Murphy (1994, 1995), proved 
the consistency and asymptotic normality of the maximum likelihood estimator in the 
gamma frailty model. Parner also presented a consistent estimator of the limiting
covariance matrix of the estimator based on inverting a discrete observed information matrix.
He noted that since the dimension of the observed information matrix is the dimension 
of the regression coefficient vector plus the number of observed survival times, inverting 
the matrix is practically infeasible for a large data set with many distinct failure times. 
Thus, he proposed another covariance estimator based on solving a discrete version of a 
second order Sturm-Liouville equation. This covariance estimator requires substantially 
less computational effort, but still is not so simple to implement. 

We (Gorfine et al. 2006) developed a new method that can handle any parametric 
frailty distribution with finite moments. Nonconjugate frailty distributions can be han- 
dled by a simple univariate numerical integration over the frailty distribution. Our new 
method possesses a number of desirable properties: a non-iterative procedure for estimat- 
ing the cumulative hazard function; consistency and asymptotic normality of parameter 
estimates; a direct consistent covariance estimator; and easy computation and implemen- 
tation. The method was found to perform well in a simulation study and the results are 
very similar to those of the EM-based method. Indeed, on a dataset-by-dataset basis, the 
correlation between our estimator and the EM estimator was found to be 95% for the 
covariate regression parameter and 98-99% for the within-cluster dependence parameter. 

The purpose of the current paper is to present the theoretical justification for the 
method in detail. Section 2 presents the estimation procedure. Section 3 presents the 
consistency and asymptotic normality results, along with the covariance estimator for the 
parameter estimates. Section 4 presents the technical conditions required for our results 
and the proofs.

2 The Proposed Approach



Consider $n$ families, with family $i$ containing $m_i$ members, $i = 1, \ldots, n$. Let
$\delta_{ij} = I(T^0_{ij} \le C_{ij})$ be a failure indicator, where $T^0_{ij}$ and $C_{ij}$ are the failure and
censoring times, respectively, for individual $ij$. Also let $T_{ij} = \min(T^0_{ij}, C_{ij})$ be the
observed follow-up time and $Z_{ij}$ be a $p \times 1$ vector of covariates. In addition, we associate
with family $i$ an unobservable family-level covariate $W_i$, the "frailty", which induces
dependence among family members. The hazard function for individual $ij$, conditional on
the family frailty $W_i$, is assumed to take the form

$$\lambda_{ij}(t) = W_i \lambda_0(t) \exp(\beta^T Z_{ij}), \qquad i = 1, \ldots, n, \quad j = 1, \ldots, m_i,$$

where $\lambda_0$ is an unspecified conditional baseline hazard and $\beta$ is a $p \times 1$ vector of unknown
regression coefficients. This is an extension of the Cox (1972) proportional hazards model,
with the hazard function for an individual in family $i$ multiplied by $W_i$. We assume that,
given $Z_{ij}$ and $W_i$, the censoring is independent and noninformative for $W_i$ and $(\beta, \Lambda_0)$
(Andersen et al., 1993, Sec. III.2.3). We assume further that the frailty $W_i$ is independent
of $Z_{ij}$ and has a density $f(w; \theta)$, where $\theta$ is an unknown parameter. For simplicity we
assume that $\theta$ is a scalar, but the development extends readily to the case where $\theta$ is a
vector. Let $\tau$ be the end of the observation period. The full likelihood of the data then
can be written as

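$$L = \prod_{i=1}^{n} \prod_{j=1}^{m_i} \{\lambda_0(T_{ij}) \exp(\beta^T Z_{ij})\}^{\delta_{ij}} \prod_{i=1}^{n} \int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw, \tag{1}$$

where $N_{ij}(t) = \delta_{ij} I(T_{ij} \le t)$, $N_{i.}(t) = \sum_{j=1}^{m_i} N_{ij}(t)$, $H_{ij}(t) = \Lambda_0(T_{ij} \wedge t) \exp(\beta^T Z_{ij})$,
$a \wedge b = \min\{a, b\}$, $\Lambda_0(\cdot)$ is the baseline cumulative hazard function, $S_{ij}(\cdot)$ is the conditional
survival function of subject $ij$, and $H_{i.}(t) = \sum_{j=1}^{m_i} H_{ij}(t)$. The log-likelihood is given by

$$\ell = \sum_{i=1}^{n} \sum_{j=1}^{m_i} \delta_{ij} \log\{\lambda_0(T_{ij}) \exp(\beta^T Z_{ij})\} + \sum_{i=1}^{n} \log \int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw.$$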

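The normalized scores (log-likelihood derivatives) for $(\beta_1, \ldots, \beta_p)$ are given by

$$U_r = \frac{1}{n} \sum_{i=1}^{n} \left[ \sum_{j=1}^{m_i} \delta_{ij} Z_{ijr} - \frac{\left[\sum_{j=1}^{m_i} H_{ij}(T_{ij}) Z_{ijr}\right] \int w^{N_{i.}(\tau)+1} \exp\{-w H_{i.}(\tau)\} f(w)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw} \right]$$

for $r = 1, \ldots, p$. The normalized score for $\theta$ is

$$U_{p+1} = \frac{1}{n} \sum_{i=1}^{n} \frac{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f'(w)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w)\,dw},$$

where $f'(w) = \frac{d}{d\theta} f(w)$. Let $\gamma = (\beta^T, \theta)^T$ and $U(\gamma, \Lambda_0) = (U_1, \ldots, U_p, U_{p+1})^T$. To obtain
estimators $\hat\beta$ and $\hat\theta$, we propose to substitute an estimator of $\Lambda_0$, denoted by $\hat\Lambda_0$, into the
equations $U(\gamma, \Lambda_0) = 0$.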

Let $Y_{ij}(t) = I(T_{ij} \ge t)$ and let $\mathcal{F}_t$ denote the entire observed history up to time $t$, that is,

$$\mathcal{F}_t = \sigma\{N_{ij}(u), Y_{ij}(u), Z_{ij},\ i = 1, \ldots, n;\ j = 1, \ldots, m_i;\ 0 \le u \le t\}.$$

Then, as discussed by Gill (1992) and Parner (1998), the stochastic intensity process for
$N_{ij}(t)$ with respect to $\mathcal{F}_t$ is given by

$$\lambda_0(t) \exp(\beta^T Z_{ij}) Y_{ij}(t) \psi_i(\gamma, \Lambda_0, t-), \tag{3}$$

where

$$\psi_i(\gamma, \Lambda_0, t) = E(W_i \mid \mathcal{F}_t).$$

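Using a Bayes theorem argument and the joint density with observation time restricted
to $[0, t)$, we obtain

$$\psi_i(\gamma, \Lambda, t) = \phi_{2i}(\gamma, \Lambda, t) / \phi_{1i}(\gamma, \Lambda, t),$$

where

$$\phi_{ki}(\gamma, \Lambda_0, t) = \int w^{N_{i.}(t)+(k-1)} \exp\{-w H_{i.}(t)\} f(w)\,dw, \qquad k = 1, \ldots, 4.$$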
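The ratio of integrals defining the conditional frailty mean can be evaluated by a univariate numerical integration, as the Introduction notes for nonconjugate frailty distributions. The following is a minimal sketch, not the authors' implementation: it assumes a gamma frailty with mean 1 and variance theta (so that a closed form is available for checking), and the function names `gamma_density`, `phi` and `psi`, the truncation point `upper` and the number of Simpson steps are all illustrative choices.

```python
import math

def gamma_density(w, theta):
    """Gamma frailty density with mean 1 and variance theta (shape = rate = 1/theta)."""
    k = 1.0 / theta
    if w <= 0.0:
        return 0.0
    return math.exp(k * math.log(k) + (k - 1.0) * math.log(w) - k * w - math.lgamma(k))

def phi(k, n_events, cum_haz, density, upper=60.0, steps=20000):
    """Integral of w^(N + k - 1) * exp(-w * H) * f(w) over (0, upper), composite Simpson rule.

    n_events plays the role of N_i.(t) and cum_haz the role of H_i.(t);
    `steps` must be even and `upper` must be large enough for the tail to be negligible.
    """
    def integrand(w):
        return w ** (n_events + k - 1) * math.exp(-w * cum_haz) * density(w)
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def psi(n_events, cum_haz, density):
    """Conditional frailty mean, computed as phi_2 / phi_1."""
    return phi(2, n_events, cum_haz, density) / phi(1, n_events, cum_haz, density)

def psi_gamma_closed_form(n_events, cum_haz, theta):
    """Known closed form for the gamma case: (1/theta + N) / (1/theta + H)."""
    return (1.0 / theta + n_events) / (1.0 / theta + cum_haz)
```

For the gamma case the numerical value can be compared with `psi_gamma_closed_form`; for another frailty law only `gamma_density` would need to be replaced, subject to the integrand being well behaved near zero for the chosen quadrature rule.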



Given the intensity model (3), in which $\exp(\beta^T Z_{ij}) \psi_i(\gamma, \Lambda_0, t-)$ may be regarded as a
time-dependent covariate effect, a natural estimator of $\Lambda_0$ is a Breslow (1974) type estimator
along the lines of Zucker (2005). For given values of $\beta$ and $\theta$ we estimate $\Lambda_0$ as a step
function with jumps at the observed failure times $\tau_k$, $k = 1, \ldots, K$, with

$$\Delta\hat\Lambda_0(\tau_k) = \frac{d_k}{\sum_{i=1}^{n} \psi_i(\gamma, \hat\Lambda_0, \tau_{k-1}) \sum_{j=1}^{m_i} Y_{ij}(\tau_k) \exp(\beta^T Z_{ij})}, \tag{4}$$

where $d_k$ is the number of failures at time $\tau_k$. Note that, given the intensity model (3), the
estimator of the $k$th jump depends on $\hat\Lambda_0$ up to and including time $\tau_{k-1}$. By this approach,
we avoid complicating the iterative optimization process with a further iterative scheme
for estimating the cumulative hazard.
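The forward recursion described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: it takes a scalar covariate, a gamma frailty with mean 1 and variance theta (so the conditional frailty mean has the closed form used in `psi_gamma`), strictly positive follow-up times, and a hypothetical data layout in which each cluster is a list of `(time, delta, z)` triples.

```python
import math

def psi_gamma(n_events, cum_haz, theta):
    """E(W | history) for a gamma frailty with mean 1 and variance theta."""
    return (1.0 / theta + n_events) / (1.0 / theta + cum_haz)

def baseline_cumulative_hazard(clusters, beta, theta):
    """One forward pass over the ordered failure times; no inner iteration.

    clusters: list of clusters, each a list of (time, delta, z) with scalar z.
    Returns a list of (tau_k, estimated cumulative baseline hazard at tau_k).
    """
    failure_times = sorted({t for c in clusters for (t, d, z) in c if d == 1})
    steps = []                      # jumps found so far: (tau, cumulative value)
    prev_tau, cum = 0.0, 0.0

    def lam_at(t):
        # Value at time t of the step function built so far.
        value = 0.0
        for tau, lam in steps:
            if tau <= t:
                value = lam
        return value

    for tau in failure_times:
        d_k = sum(1 for c in clusters for (t, d, z) in c if d == 1 and t == tau)
        denom = 0.0
        for c in clusters:
            # Failure count and cumulative hazard of the cluster at the previous failure time.
            n_i = sum(1 for (t, d, z) in c if d == 1 and t <= prev_tau)
            h_i = sum(lam_at(min(t, prev_tau)) * math.exp(beta * z) for (t, d, z) in c)
            # Risk-set contribution of the cluster at the current failure time.
            at_risk = sum(math.exp(beta * z) for (t, d, z) in c if t >= tau)
            denom += psi_gamma(n_i, h_i, theta) * at_risk
        cum += d_k / denom
        steps.append((tau, cum))
        prev_tau = tau
    return steps
```

Because each jump uses only the values already computed at earlier failure times, the whole step function is obtained in a single pass for any fixed value of the regression and dependence parameters. At the first failure time the conditional frailty mean equals 1, so the first jump coincides with the usual Breslow jump.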



3 Asymptotic Properties 

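Let $\gamma^\circ = (\beta^{\circ T}, \theta^\circ)^T$ with $\beta^\circ$, $\theta^\circ$ and $\Lambda_0^\circ(t)$ denoting the respective true values of $\beta$, $\theta$ and
$\Lambda_0(t)$, and let $\hat\gamma = (\hat\beta^T, \hat\theta)^T$. We assume the technical conditions listed in Section 4.1.

In Section 4.3, we establish the following results, using arguments patterned after
Zucker (2005, Appendix A.3).

A. $\hat\Lambda_0(t, \gamma)$ converges almost surely to $\Lambda_0(t, \gamma)$ uniformly in $t$ and $\gamma$.

B. $U(\gamma, \hat\Lambda_0(\cdot, \gamma))$ converges almost surely uniformly in $t$ and $\gamma$ to a limit $u(\gamma, \Lambda_0(\cdot, \gamma))$.

C. There exists a unique consistent root to $U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = 0$.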

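In Section 4.4, we show that $n^{1/2}(\hat\gamma - \gamma^\circ)$ is asymptotically normally distributed. We
accomplish this by analyzing in turn each of the terms in the following decomposition:

$$0 = U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma))
= U(\gamma^\circ, \Lambda_0^\circ) + [U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]
+ [U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) - U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))].$$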



We show further that the covariance matrix of $\hat\gamma$ can be consistently estimated by the
sandwich estimator

$$\hat D^{-1}(\hat\gamma)\{\hat V(\hat\gamma) + \hat G(\hat\gamma) + \hat C(\hat\gamma)\}\hat D^{-1}(\hat\gamma)^T. \tag{5}$$

The matrix $D$ consists of the derivatives of the $U_r$'s with respect to the parameters $\gamma$. $V$
is the asymptotic covariance matrix of $U(\gamma^\circ, \Lambda_0^\circ)$, $G$ is the asymptotic covariance matrix
of $[U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]$, and $C$ is the asymptotic covariance matrix between
$U(\gamma^\circ, \Lambda_0^\circ)$ and $[U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]$. The term $G + C$ reflects the added variance
resulting from the need to estimate the cumulative hazard function. All the above matrices
are defined explicitly in Section 4.4.
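Once the four component matrices are available, assembling the sandwich form is a matter of small-matrix algebra. The sketch below is only illustrative: it assumes the matrices are supplied as nested Python lists, and the helper names `mat_mul`, `transpose`, `mat_inv` and `sandwich` are hypothetical. It does not compute the component matrices themselves, which are defined in Section 4.4.

```python
def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def mat_inv(a):
    """Inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(x) for x in a[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sandwich(D, V, G, C):
    """Return D^{-1} (V + G + C) D^{-1 T} for square matrices of equal size."""
    size = len(V)
    middle = [[V[i][j] + G[i][j] + C[i][j] for j in range(size)] for i in range(size)]
    d_inv = mat_inv(D)
    return mat_mul(mat_mul(d_inv, middle), transpose(d_inv))
```

The explicit transpose matters because the derivative matrix is not assumed to be symmetric here.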

4 Technical Conditions and Proofs 

This section presents the technical conditions we assume for the asymptotic results and 
the proofs of these results. 

4.1 Technical Conditions 

In deriving the asymptotic properties of $\hat\gamma$ we make the following assumptions:

1. The random vectors $(T^0_{i1}, \ldots, T^0_{im_i}, C_{i1}, \ldots, C_{im_i}, Z_{i1}, \ldots, Z_{im_i}, W_i)$, $i = 1, \ldots, n$, are
independent and identically distributed.

2. There is a finite maximum follow-up time $\tau > 0$, with $E[\sum_{j=1}^{m_i} Y_{ij}(\tau)] = y^* > 0$ for
all $i$.

3. (a) Conditional on $Z_{ij}$ and $W_i$, the censoring is independent and noninformative
of $W_i$ and $(\beta, \Lambda_0)$.

(b) $W_i$ is independent of $Z_{ij}$ and of $m_i$.

4. The frailty random variable $W_i$ has finite moments up to order $(m + 2)$, where $m$ is
a fixed upper bound on $m_i$.

5. $Z_{ij}$ is bounded.

6. The parameter $\gamma$ lies in a compact subset $\mathcal{G}$ of $\mathbb{R}^{p+1}$ containing an open
neighborhood of $\gamma^\circ$.

7. There exist $b > 0$ and $C > 0$ such that

$$\lim_{w \to 0} w^{-(b-1)} f(w) = C.$$

8. The baseline hazard function $\lambda_0^\circ(t)$ is bounded over $[0, \tau]$ by some constant $\lambda_{\max}$.

9. The function $f'(w; \theta) = (d/d\theta) f(w; \theta)$ is absolutely integrable.

10. The censoring distribution has at most finitely many jumps on $[0, \tau]$.

11. The matrix $[(\partial/\partial\gamma) U(\gamma, \hat\Lambda_0(\cdot, \gamma))]|_{\gamma=\gamma^\circ}$ is invertible with probability going to 1 as
$n \to \infty$.

4.2 Technical Preliminaries 

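Since $\beta$ and $Z_{ij}$ are bounded, there exists a constant $\nu > 0$ such that

$$\nu^{-1} \le \exp(\beta^T Z_{ij}) \le \nu. \tag{6}$$

Now recall that

$$\psi_i(\gamma, \Lambda, t) = \frac{\phi_{2i}(\gamma, \Lambda, t)}{\phi_{1i}(\gamma, \Lambda, t)} = \frac{\int w^{N_{i.}(t)+1} e^{-H_{i.}(t) w} f(w)\,dw}{\int w^{N_{i.}(t)} e^{-H_{i.}(t) w} f(w)\,dw},$$

with $H_{i.}(t) = H_{i.}(t, \gamma, \Lambda) = \sum_{j=1}^{m_i} \Lambda(T_{ij} \wedge t) \exp(\beta^T Z_{ij})$ (here we define $H_{i.}$ so as to allow
dependence on a general $\gamma$ and $\Lambda$, which will often not be explicitly indicated in the
notation). Define (for $0 \le r \le m$ and $h \ge 0$)

$$\psi^*(r, h) = \frac{\int w^{r+1} e^{-hw} f(w)\,dw}{\int w^{r} e^{-hw} f(w)\,dw}.$$

Also define $\psi^*_{\min}(h) = \min_{0 \le r \le m} \psi^*(r, h)$ and $\psi^*_{\max}(h) = \max_{0 \le r \le m} \psi^*(r, h)$. In the
expression for $\psi^*(r, h)$, the numerator and denominator are bounded above since $W$ is assumed
to have finite $(m + 2)$-th moment. In addition, since $W$ is nondegenerate, the numerator
and denominator are strictly positive. Thus $\psi^*_{\max}(h)$ is finite and $\psi^*_{\min}(h)$ is strictly
positive.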

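Lemma 1: The function $\psi^*(r, h)$ is decreasing in $h$. Hence for all $\gamma \in \mathcal{G}$ and all $t$,

$$\psi_i(\gamma, \Lambda, t) \le \psi^*_{\max}(0), \tag{7}$$

$$\psi_i(\gamma, \Lambda, t) \ge \psi^*_{\min}(m\nu\Lambda(t)). \tag{8}$$

In addition, there exist $B > 0$ and $\bar h > 0$ such that, for all $h \ge \bar h$,

$$\psi^*_{\min}(h) \ge B h^{-1}. \tag{9}$$

Proof: We have

$$\frac{d}{dh} \psi^*(r, h) = -\left[ \frac{\int w^{r+2} e^{-hw} f(w)\,dw}{\int w^{r} e^{-hw} f(w)\,dw} - \left( \frac{\int w^{r+1} e^{-hw} f(w)\,dw}{\int w^{r} e^{-hw} f(w)\,dw} \right)^2 \right]. \tag{10}$$

This is negative for all $h$, and so $\psi^*(r, h)$ is decreasing in $h$. Now $\psi_i(\gamma, \Lambda, t) = \psi^*(N_{i.}(t), H_{i.}(t))$.
Since $0 \le H_{i.}(t) \le m\nu\Lambda(t)$, (7) and (8) follow. As for (9), from a change of variable and
Assumption 7,

$$\lim_{h \to \infty} h \psi^*(r, h) = r + b.$$

Now just take $\bar h$ large enough so that this limit is obtained up to some factor, e.g. 1.01.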
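The two properties used in Lemma 1, monotonicity in h and the limit of h times the ratio, can be checked numerically in a case where the ratio has a closed form. The sketch below assumes a gamma frailty with shape and rate both equal to b, whose density behaves like a constant times w^(b-1) near zero, so the same b appears in Assumption 7. The function names are hypothetical and the example is an illustration, not part of the proof.

```python
def psi_star_gamma(r, h, b):
    """Closed form of the ratio of integrals for a gamma frailty with shape = rate = b.

    For this density the integral of w^r * exp(-h w) * f(w) is proportional to
    Gamma(r + b) / (h + b)^(r + b), so the ratio reduces to (r + b) / (h + b).
    """
    return (r + b) / (h + b)

def is_decreasing_in_h(r, b, grid):
    """Check that psi* is strictly decreasing along an increasing grid of h values."""
    values = [psi_star_gamma(r, h, b) for h in grid]
    return all(x > y for x, y in zip(values, values[1:]))

def limit_gap(r, h, b):
    """Distance between h * psi*(r, h) and its limiting value r + b."""
    return abs(h * psi_star_gamma(r, h, b) - (r + b))
```

In this special case the gap equals (r + b) b / (h + b), which shrinks at rate 1/h, in line with the limit stated in the proof.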

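Lemma 2: Define $\bar\Lambda = 1.03 e^{ma} \bar h/(m\nu)$, with $a = 1.01 m \nu^2/(B y^*)$, with $\bar h$ and $B$ as above.
Then, with probability one, there exists $n'$ such that, for all $t \in [0, \tau]$ and $\gamma \in \mathcal{G}$,

$$\hat\Lambda_0(t, \gamma) \le \bar\Lambda \quad \text{for } n \ge n'. \tag{11}$$

Thus, $\hat\Lambda_0(t, \gamma)$ is naturally bounded, with no need to impose an upper bound artificially.
Proof: To simplify the writing below, we will suppress the argument $\gamma$ in $\hat\Lambda_0(t, \gamma)$. Recall

$$\Delta\hat\Lambda_0(\tau_k) = \left[ \sum_{i=1}^{n} \psi_i(\gamma, \hat\Lambda_0, \tau_{k-1}) \sum_{j=1}^{m_i} Y_{ij}(\tau_k) \exp(\beta^T Z_{ij}) \right]^{-1},$$

where we now take $d_k = 1$ since the survival time distribution is assumed continuous.
Using Lemma 1 and (6), we have

$$\Delta\hat\Lambda_0(\tau_k) \le n^{-1} \nu \left[ \psi^*_{\min}(m\nu\hat\Lambda_0(\tau_{k-1})) \right]^{-1} \left[ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Y_{ij}(\tau) \right]^{-1}.$$

By the strong law of large numbers, there exists with probability one some $n^*$ such that

$$\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Y_{ij}(\tau) \ge 0.999 y^* \quad \text{for } n \ge n^*. \tag{12}$$

We thus have, for $n \ge n^*$,

$$\Delta\hat\Lambda_0(\tau_k) \le n^{-1} \left( \frac{1.01\nu}{y^*} \right) \left[ \psi^*_{\min}(m\nu\hat\Lambda_0(\tau_{k-1})) \right]^{-1}. \tag{13}$$

Now, if $\hat\Lambda_0(t) \le \bar h/(m\nu)$ for all $t$ then we are done. Otherwise, there exists $k'$ such that
$\hat\Lambda_0(\tau_k) \le \bar h/(m\nu)$ for $k < k'$ and $\hat\Lambda_0(\tau_k) > \bar h/(m\nu)$ for $k \ge k'$. Using the last inequality of
Lemma 1, we obtain, for $k > k'$,

$$\Delta\hat\Lambda_0(\tau_k) \le n^{-1} a \hat\Lambda_0(\tau_{k-1}),$$

or, in other words,

$$\hat\Lambda_0(\tau_k) \le \left( 1 + \frac{a}{n} \right) \hat\Lambda_0(\tau_{k-1}).$$

Iterating the above inequality we get

$$\hat\Lambda_0(\tau_{k'+\ell}) \le \left( 1 + \frac{a}{n} \right)^{\ell} \hat\Lambda_0(\tau_{k'}) \le \left( 1 + \frac{a}{n} \right)^{mn} \hat\Lambda_0(\tau_{k'}) \le 1.01 e^{ma} \hat\Lambda_0(\tau_{k'})$$

for $n$ large enough. But, using (13) and the fact that $\hat\Lambda_0(\tau_{k'-1}) \le \bar h/(m\nu)$, we have

$$\hat\Lambda_0(\tau_{k'}) \le \frac{\bar h}{m\nu} + n^{-1} \left( \frac{1.01\nu}{y^*} \right) \left[ \psi^*_{\min}(\bar h) \right]^{-1},$$

which is less than $1.01 \bar h/(m\nu)$ for $n$ large enough. The desired conclusion follows.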



Lemma 3: We have $\sup_{s \in [0,\tau]} |\hat\Lambda_0(s, \gamma^\circ) - \hat\Lambda_0(s-, \gamma^\circ)| \to 0$ as $n \to \infty$, as an immediate
consequence of Lemma 2 and (13).

4.3 Consistency 

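We now show the almost sure consistency of $\hat\beta$ and $\hat\Lambda_0$. The argument is built on
Claims A-C of Section 3, which we prove below. Our argument follows Zucker (2005,
Appendix A.3).

Claim A: $\hat\Lambda_0(t, \gamma)$ converges a.s. to some function $\Lambda_0(t, \gamma)$ uniformly in $t$ and $\gamma$.
Proof: Whenever a functional norm is written below, the relevant uniform norm is
intended. We define $\Lambda_{\max} = \max(\bar\Lambda, \Lambda_0^\circ(\tau))$ and $\psi^{**}(r, h) = \psi^*(r, h \wedge h_{\max})$, where
$h_{\max} = m\nu\Lambda_{\max}$. It is easy to see from (10) that $\psi^{**}(r, h)$ is Lipschitz continuous in
$h$ (uniformly in $r$). Recall that $\psi_i(\gamma, \Lambda, t) = \psi^*(N_{i.}(t), H_{i.}(t, \gamma, \Lambda))$. Lemma 2 implies
that $H_{i.}(t, \gamma, \hat\Lambda_0(\cdot, \gamma)) \le h_{\max}$ for all $t \in [0, \tau]$ and $\gamma \in \mathcal{G}$. Hence $\psi_i(\gamma, \hat\Lambda_0(\cdot, \gamma), t) =
\psi^{**}(N_{i.}(t), H_{i.}(t, \gamma, \hat\Lambda_0(\cdot, \gamma)))$.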
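Now define, for a general function $\Lambda$,

$$\Xi_n(t, \gamma, \Lambda) = \int_0^t \frac{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s)}{n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})}$$

and

$$\Xi(t, \gamma, \Lambda) = \int_0^t \frac{E[\sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma^\circ, \Lambda_0^\circ)) Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij})]\, \lambda_0^\circ(s)}{E[\sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})]}\, ds.$$

By definition, $\hat\Lambda_0(t, \gamma)$ satisfies the equation

$$\hat\Lambda_0(t, \gamma) = \Xi_n(t, \gamma, \hat\Lambda_0(\cdot, \gamma)). \tag{14}$$

Next, define

$$q_\gamma(s, \Lambda) = \frac{E[\sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma^\circ, \Lambda_0^\circ)) Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij})]\, \lambda_0^\circ(s)}{E[\sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij})]}.$$

This function is uniformly bounded by $B^* = [\psi^*_{\max}(0)/\psi^*_{\min}(h_{\max})] \lambda_{\max}$. Moreover, by the
Lipschitz continuity of $\psi^{**}(r, h)$ with respect to $h$, it satisfies a Lipschitz-like condition of
the form $|q_\gamma(s, \Lambda_1) - q_\gamma(s, \Lambda_2)| \le K \sup_{0 \le u \le s} |\Lambda_1(u) - \Lambda_2(u)|$. Hence, by mimicking the
argument of Hartman (1973, Theorem 1.1), we find that the equation $\Lambda(t) = \Xi(t, \gamma, \Lambda)$
has a unique solution, which we denote by $\Lambda_0(t, \gamma)$. The claim then is that $\hat\Lambda_0(t, \gamma)$
converges almost surely (uniformly in $t$ and $\gamma$) to $\Lambda_0(t, \gamma)$. Though it may be possible to
prove this claim directly, we shall use a convenient indirect argument.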

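Define $\tilde\Lambda_0^{(n)}(t, \gamma)$ to be a modified version of $\hat\Lambda_0(t, \gamma)$ defined by linear interpolation
between the jumps. Lemma 3 implies that, with probability one,

$$\sup_{t,\gamma} |\tilde\Lambda_0^{(n)}(t, \gamma) - \hat\Lambda_0(t, \gamma)| \to 0, \tag{15}$$

and thus

$$\sup_{t,\gamma} |\Xi_n(t, \gamma, \tilde\Lambda_0^{(n)}(\cdot, \gamma)) - \Xi_n(t, \gamma, \hat\Lambda_0(\cdot, \gamma))| \to 0. \tag{16}$$

Lemma 2 shows that the family $\mathcal{L} = \{\tilde\Lambda_0^{(n)}(t, \gamma),\ n \ge n'\}$ is uniformly bounded. We can
show further that $\mathcal{L}$ is equicontinuous. This is done as follows.

Recall that $N_{i.}(t) = \sum_{j=1}^{m_i} N_{ij}(t)$. Write $\bar N(t) = n^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} N_{ij}(t)$. We have $\bar N(t) \to
E[N_{i.}(t)]$ as $n \to \infty$ uniformly in $t$ with probability one, with

$$E[N_{i.}(t)] = \int_0^t E\left[ \psi_i(\gamma^\circ, \Lambda_0^\circ, s-) \sum_{j=1}^{m_i} Y_{ij}(s) \exp(\beta^{\circ T} Z_{ij}) \right] \lambda_0^\circ(s)\, ds.$$

In view of this and (13) there exists a probability-one set of realizations $\Omega^*$ on which the
following holds: for any given $\epsilon > 0$, we can find $n''(\epsilon)$ such that $\sup_t |\bar N(t) - E[N_{i.}(t)]| \le
\epsilon/(4B^\circ)$ for all $n \ge n''(\epsilon)$, where $B^\circ = 1.01\nu/[\psi^*_{\min}(h_{\max}) y^*]$. In consequence, for all $t$ and
$u$ with $u < t$, we find that $\hat\Lambda_0(t, \gamma) - \hat\Lambda_0(u, \gamma)$ satisfies

$$\hat\Lambda_0(t, \gamma) - \hat\Lambda_0(u, \gamma) \le B^*(t - u) + \frac{\epsilon}{2} \quad \text{for all } n \ge n''(\epsilon). \tag{17}$$

Moreover, it is easy to see that $\hat\Lambda_0(t, \gamma)$ is Lipschitz continuous in $\gamma$ with Lipschitz constant
$C^*$, say, that is independent of $t$.

These two results imply that $\mathcal{L}$ is equicontinuous. This is seen as follows. For given $\epsilon$,
we need to find $\delta_1$ and $\delta_2$ such that $|\tilde\Lambda_0^{(n)}(t, \gamma) - \tilde\Lambda_0^{(n)}(u, \gamma)| \le \epsilon$ whenever $|t - u| \le \delta_1$ and
$|\tilde\Lambda_0^{(n)}(t, \gamma) - \tilde\Lambda_0^{(n)}(t, \gamma')| \le \epsilon$ whenever $\|\gamma - \gamma'\| \le \delta_2$. The latter is easily obtained using
the Lipschitz continuity of $\hat\Lambda_0(t, \gamma)$ with respect to $\gamma$. As for the former, for $n \ge n''(\epsilon)$ this
can be accomplished using (17), while for $n$ in the finite set $n' \le n < n''(\epsilon)$ this can be
accomplished using the fact that the function $\tilde\Lambda_0^{(n)}(t, \gamma)$ is uniformly continuous on $[0, \tau]$
for every given $n$.

We have thus shown that $\mathcal{L}$ is (almost surely) a relatively compact set in the space
$C([0, \tau] \times \mathcal{G})$.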
Next, define

$$A(\gamma, \Lambda, s) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij}),$$

$$a(\gamma, \Lambda, s) = E\left[ \sum_{j=1}^{m_i} \psi^{**}(N_{i.}(s-), H_{i.}(s-, \gamma, \Lambda)) Y_{ij}(s) \exp(\beta^T Z_{ij}) \right].$$

For any fixed continuous $\Lambda$, the functional strong law of large numbers of Andersen &
Gill (1982, Appendix III) implies that

$$\sup_{s,\gamma} |A(\gamma, \Lambda, s) - a(\gamma, \Lambda, s)| \to 0 \quad \text{a.s.} \tag{18}$$

Here we need the following more complex result:

$$\sup_{s,\gamma} |A(\gamma, \tilde\Lambda_0^{(n)}, s) - a(\gamma, \tilde\Lambda_0^{(n)}, s)| \to 0 \quad \text{a.s.} \tag{19}$$

The proof of (19) is lengthy; we give the details in Section 4.5 below. In outline form,
the proof involves two steps: (1) showing that, for any given $\epsilon > 0$, we can define an
appropriate finite class $\mathcal{L}^*$ of functions $\Lambda$ such that $\tilde\Lambda_0^{(n)}$ can be suitably approximated by
some member of the class; (2) applying the result (18), which will hold uniformly over
the finite class.

Given (19) and the a.s. uniform convergence of $\bar N(t)$ to $E[N_{i.}(t)]$, we can infer that

$$\sup_{t,\gamma} |\Xi_n(t, \gamma, \tilde\Lambda_0^{(n)}(t, \gamma)) - \Xi(t, \gamma, \tilde\Lambda_0^{(n)}(t, \gamma))| \to 0 \quad \text{a.s.} \tag{20}$$

The result (20) is easily obtained by adapting the argument of Aalen (1976, Lemma 6.1),
using the equicontinuity of $\mathcal{L}$. It is here that we use Assumption 10, for the adaptation of
Aalen's argument requires $a(\gamma, \Lambda, s)$ to be piecewise continuous with finite left and right
limits at each point of discontinuity.

From (14), (15), (16), and (20) it follows that any limit point of $\{\tilde\Lambda_0^{(n)}(t, \gamma)\}$ must
satisfy the equation $\Lambda = \Xi(t, \gamma, \Lambda)$. Since $\Lambda_0(t, \gamma)$ is the unique solution of this equation,
it is the unique limit point of $\{\tilde\Lambda_0^{(n)}(t, \gamma)\}$. Thus $\{\tilde\Lambda_0^{(n)}(t, \gamma)\}$ is a sequence in a compact
set with unique limit point $\Lambda_0(t, \gamma)$. Hence $\tilde\Lambda_0^{(n)}(t, \gamma)$ converges a.s. uniformly in $t$ and $\gamma$
to $\Lambda_0(t, \gamma)$. In view of (15), the same holds of $\hat\Lambda_0(t, \gamma)$, which is the desired result. Note
that $\Lambda_0(\cdot, \gamma^\circ) = \Lambda_0^\circ(\cdot)$ since $\Lambda_0^\circ$ trivially solves the equation $\Lambda = \Xi(t, \gamma^\circ, \Lambda)$.

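Claim B: With $u(\gamma, \Lambda_0(\cdot, \gamma)) = E[U(\gamma, \Lambda_0(\cdot, \gamma))]$, we have $U(\gamma, \hat\Lambda_0(\cdot, \gamma)) \to u(\gamma, \Lambda_0(\cdot, \gamma))$
uniformly in $\gamma \in \mathcal{G}$ with probability one.

Proof: Since $U(\gamma, \Lambda_0(\cdot, \gamma))$ is the mean of iid terms, the functional strong law of large numbers
of Andersen & Gill (1982, Appendix III) implies that $U(\gamma, \Lambda_0(\cdot, \gamma))$ converges uniformly
in $\gamma$ almost surely to $u(\gamma, \Lambda_0(\cdot, \gamma))$. It remains only to show that

$$\sup_{\gamma} \|U(\gamma, \hat\Lambda_0(\cdot, \gamma)) - U(\gamma, \Lambda_0(\cdot, \gamma))\| \to 0 \tag{21}$$

almost surely. The structure of $U(\gamma, \Lambda)$ reveals that there exists some constant $C^\circ$
(independent of $\gamma$) such that $\|U(\gamma, \Lambda_1) - U(\gamma, \Lambda_2)\| \le C^\circ \|\Lambda_1 - \Lambda_2\|$. From this along with
Claim A, (21) follows.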

Claim C: There exists a unique consistent root to $U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = 0$.
Proof: We apply Foutz's (1977) consistency theorem for maximum likelihood type
estimators. The following conditions must be established:

F1. $\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma))/\partial\gamma$ exists and is continuous in an open neighborhood about $\gamma^\circ$.

F2. The convergence of $\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma))/\partial\gamma$ to its limit is uniform in an open neighborhood
of $\gamma^\circ$.

F3. $U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) \to 0$ as $n \to \infty$.

F4. The matrix $-[\partial U(\gamma, \hat\Lambda_0(\cdot, \gamma))/\partial\gamma]|_{\gamma=\gamma^\circ}$ is invertible with probability going to 1 as
$n \to \infty$. (In Foutz's paper, the matrix in question is symmetric, and so he stated the
condition in terms of positive definiteness. But his proof, which is based on the inverse
function theorem, shows that the basic condition needed is invertibility.)

It is easily seen that Condition F1 holds. Given Assumptions 2, 4, and 5, Condition F2
follows from the previously-cited functional law of large numbers. As for Condition F3,
in Claim B we showed that $U(\gamma, \hat\Lambda_0(\cdot, \gamma))$ converges a.s. uniformly to $u(\gamma, \Lambda_0(\cdot, \gamma)) =
E[U(\gamma, \Lambda_0(\cdot, \gamma))]$. We noted already that $\Lambda_0(\cdot, \gamma^\circ) = \Lambda_0^\circ(\cdot)$. Thus we need only show that
$E[U(\gamma^\circ, \Lambda_0^\circ)] = 0$. Since $U$ is a score function derived from a classical iid likelihood,
this result follows from classical likelihood theory. Condition F4 has been assumed in
Assumption 11. With Conditions F1-F4 established, the result follows.

4.4 Asymptotic Normality

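To show that $\hat\gamma$ is asymptotically normally distributed, we write

$$0 = U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma))
= U(\gamma^\circ, \Lambda_0^\circ) + [U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]
+ [U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) - U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))].$$

In the following we consider each of the above terms of the right-hand side of the equation.

Step I

We can write $U(\gamma^\circ, \Lambda_0^\circ) = n^{-1} \sum_{i=1}^{n} \xi_i$, where $\xi_i$ is a $(p+1)$-vector with $r$-th element,
$r = 1, \ldots, p$, given by

$$\xi_{ir} = \sum_{j=1}^{m_i} \delta_{ij} Z_{ijr} - \frac{\left[\sum_{j=1}^{m_i} H_{ij}(T_{ij}) Z_{ijr}\right] \int w^{N_{i.}(\tau)+1} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw}$$

and $(p+1)$-th element given by

$$\xi_{i(p+1)} = \frac{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f'(w; \theta)\,dw}{\int w^{N_{i.}(\tau)} \exp\{-w H_{i.}(\tau)\} f(w; \theta)\,dw}.$$

Thus $U(\gamma^\circ, \Lambda_0^\circ)$ is the mean of the iid mean-zero random vectors $\xi_i$. It hence follows
from the central limit theorem that $n^{1/2} U(\gamma^\circ, \Lambda_0^\circ)$ is asymptotically mean-zero multivariate
normal. To estimate the covariance matrix, let $\xi_i^*$ be the counterpart of $\xi_i$ with estimates
of $\gamma$ and $\Lambda_0$ substituted for the true values. Then an empirical estimator of the covariance
matrix is given by $\hat V(\hat\gamma) = n^{-1} \sum_{i=1}^{n} \xi_i^* \xi_i^{*T}$. This is a consistent estimator of the covariance
matrix since $\hat\Lambda_0(t, \gamma)$ converges to $\Lambda_0(t, \gamma)$ a.s. uniformly in $t$ and $\gamma$ (Claim A), and $\hat\gamma$ is
a consistent estimator of $\gamma^\circ$ (Claim C).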
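The empirical covariance estimator of Step I is an average of outer products of the per-family score contributions. A minimal sketch follows, assuming those contributions have already been evaluated at the parameter estimates and are supplied as a list of equal-length vectors; the function name `empirical_covariance` is hypothetical.

```python
def empirical_covariance(xi):
    """Average of outer products of the per-family score vectors.

    xi: list of n vectors (lists) of common length p + 1.
    No centring is applied, since the contributions have mean zero at the true values.
    """
    n = len(xi)
    dim = len(xi[0])
    return [[sum(v[r] * v[l] for v in xi) / n for l in range(dim)] for r in range(dim)]
```

The result is a symmetric matrix with one row and column per parameter, which can be passed on as the first of the three middle terms of the sandwich form.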
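Step II

Let $\hat U_r = U_r(\gamma^\circ, \hat\Lambda_0)$, $r = 1, \ldots, p$, and $\hat U_{p+1} = U_{p+1}(\gamma^\circ, \hat\Lambda_0)$ (in this segment of the
proof, when we write $(\gamma^\circ, \hat\Lambda_0)$ the intent is to signify $(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ))$). First order Taylor
expansion of $\hat U_r$ about $\Lambda_0^\circ$, $r = 1, \ldots, p+1$, gives

$$n^{1/2}\{\hat U_r - U_r(\gamma^\circ, \Lambda_0^\circ)\} = n^{-1/2} \sum_{i=1}^{n} \sum_{j=1}^{m_i} Q_{ijr}(\gamma^\circ, \Lambda_0^\circ, T_{ij}) \{\hat\Lambda_0(T_{ij}, \gamma^\circ) - \Lambda_0^\circ(T_{ij})\} + o_p(1), \tag{22}$$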

where 

I 02t(7°, Ag, r) ^ ^ I'T ^ 7 
0?.(r,Ag,r)''^^-^4l ^^'^ '^'^ 

for r = 1, . . . ,p, and 

n I-.' ^'T\ R- f fe(T°.Ag,r),>if(7°,Ag,r) 4>'i' (l° . K, r) 

«.W7. A, = I Mr,Kr) 

with = exp(/3'^Zjj) and 

$$\phi^{(\theta)}_{ki}(\gamma, \Lambda_0, t) = \int w^{N_{i.}(t)+(k-1)} \exp\{-w H_{i.}(t)\} f'(w)\,dw, \qquad k = 1, 2.$$

The validity of the approximation (22) can be seen by an argument similar to that used
in connection with (24) below.

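Given the intensity process (3), the process

$$M_{ij}(t) = N_{ij}(t) - \int_0^t \lambda_0^\circ(u) \exp(\beta^{\circ T} Z_{ij}) Y_{ij}(u) \psi_i(\gamma^\circ, \Lambda_0^\circ, u-)\, du$$

is a mean zero martingale with respect to the filtration $\mathcal{F}_t$. Also, by Lemma 3, we have
that $\sup_{s \in [0,\tau]} |\hat\Lambda_0(s, \gamma^\circ) - \hat\Lambda_0(s-, \gamma^\circ)|$ converges to zero. Thus, replacing $s-$ by $s$ we
obtain the following approximation, uniformly over $t \in [0, \tau]$:

$$\hat\Lambda_0(t, \gamma^\circ) - \Lambda_0^\circ(t) \approx \frac{1}{n} \int_0^t \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dM_{ij}(s)
+ \frac{1}{n} \int_0^t \left[ \{\mathcal{Y}(s, \hat\Lambda_0)\}^{-1} - \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} \right] \sum_{i=1}^{n} \sum_{j=1}^{m_i} dN_{ij}(s), \tag{23}$$

where

$$\mathcal{Y}(s, \Lambda) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{m_i} \psi_i(\gamma^\circ, \Lambda, s) \exp(\beta^{\circ T} Z_{ij}) Y_{ij}(s).$$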

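Now let $\mathcal{W}(s, r) = \{\mathcal{Y}(s, \Lambda_0^\circ + r\Delta)\}^{-1}$ with $\Delta = \hat\Lambda_0 - \Lambda_0^\circ$. Define $\dot{\mathcal{W}}$ and $\ddot{\mathcal{W}}$ as the
first and second derivative of $\mathcal{W}$ with respect to $r$, respectively. Then, computing the
necessary derivatives and carrying out a first order Taylor expansion of $\mathcal{W}(s, r)$ around
$r = 0$ evaluated at $r = 1$ with Lagrange remainder (Abramowitz & Stegun, 1972, p. 880),
we get

$$\{\mathcal{Y}(s, \hat\Lambda_0)\}^{-1} - \{\mathcal{Y}(s, \Lambda_0^\circ)\}^{-1} = \dot{\mathcal{W}}(s, 0) + \tfrac{1}{2} \ddot{\mathcal{W}}(s, \tilde r(s))$$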



exp(/3^Z,,){Ao(Ti, As)- A°(T,, A s)}, (24) 



{y{s, Ao)V 2 

where $R_{ij}(u) = \exp(\beta^T Z_{ij}) Y_{ij}(u)$, $R_{i.}(u) = \sum_{j=1}^{m_i} R_{ij}(u)$, $\tilde r(s) \in [0, 1]$,



Viiir, s) 



03.(7°, Ag + rA,s) 



(7°,A° + rA,s) 



0ii(7°,Ag + rA,s) i0i,(7°,AS + rA,s)J ' 
and $h_i(r, s)$ is as defined in Section 4.6 below, and shown there to be $o(1)$ uniformly in $r$
and $s$.

Let $\eta_{1i}(s) = \eta_{1i}(0, s)$. Plugging (24) into (23) we get



Ao(t,7°)-AS(t)^n-i / {yis,Al)}-'J2J:dM.M 



-n 



—n 



+ n 



-2 



2^ 2^ r.;/ A0U2 ^^P(^ Zki){Ao[s) - Ao(s)} ^ 2^ c/Arij(s) 

k=il=i xyKSi^^Q)! i=ij=i 

' t E ^^^%vf^AoU^'^'^ exp(r Z,0{Ao(T,,) - A°(T,0} £ E rfiV,(.) 

fc=l«=l iJ^l-Sj^^JI i=lj=l 

t n rrik -t n rrii 

E E l^h,{f{s), s) exp(/3^Z,,){Ao(T,0 - A°(T,0} E E dN^A'). 

k=l 1=1 ^ i=\ j=l 



The third term of the above equation can be written, by interchanging the order of 
integration, as 



n^n;^ i?,, (5)7/1^(5) 



--^EEEE 

k=l 1=1 i=lj=l 



exp(/3^ZH) 



{Aoiu) - A°,iu)}dNM} 



dNi,{s)
{Ao{s) - A°(s)} E t)dNij{s),
i=i j=i 



where Nij(t) = I{Tij < t) and 

n,j{s,t)=n-^ / {y{u,Al)}-^Rdu)mi{u)expif3^Z,^)Y.J2dN, 

■J S ,1,1 



k=l 1=1 



Hence we get 

„j n nii 

•^0 i=l .7=1 



J 



i=i 'j=i 

where 

k=l 1=1 

The $o(n^{-1/2})$ is uniform in $t$ (see Sec. 4.6 below) and will be dominated by Q and T, which
are of order $n^{-1/2}$. Hence the $o(n^{-1/2})$ term can be ignored.

An argument similar to that of Yang & Prentice (1999) and Zucker (2005) now yields 
the martingale representation 

Ao(t 7°) - A°(t) - ^ r ^(^-)^^i^^i^^^(^) (25) 



where 



Pit) = n 



s<t 

Based on (22), we can write



1 + EE{'^^A(^) + 

i=i i=i 



UriY, Ao) - t/.(7°, AS) ^ E E r A^, s){Ao(s, 7°) - Al{s)}dN,,{s). 

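Plugging the martingale representation into the above equation and carrying out
some more algebra (again involving an interchange of integrals) gives

$$U_r(\gamma^\circ, \hat\Lambda_0) - U_r(\gamma^\circ, \Lambda_0^\circ) \approx \frac{1}{n} \int_0^\tau \frac{\pi_r(s, \gamma^\circ, \Lambda_0^\circ)\, \hat p(s-)}{\mathcal{Y}(s, \Lambda_0^\circ)} \sum_{i=1}^{n} \sum_{j=1}^{m_i} dM_{ij}(s), \tag{26}$$

where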

7r.(.,7,Ao)=- . 

Therefore, $n^{1/2}[U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)]$ is asymptotically mean zero multivariate
normal with covariance matrix that can be consistently estimated by

•^0 {3^(s,Ao)}2 

for r, / = 1, . . . ,p + 1. 
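Step III

We now examine the sum of $U(\gamma^\circ, \Lambda_0^\circ)$ and $U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U(\gamma^\circ, \Lambda_0^\circ)$. From (26),
we have

$$U_r(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U_r(\gamma^\circ, \Lambda_0^\circ) \approx \frac{1}{n} \int_0^\tau \alpha_r(s) \sum_{k=1}^{n} \sum_{l=1}^{m_k} dM_{kl}(s) = \frac{1}{n} \sum_{k=1}^{n} \mu_{kr},$$

where $\alpha_r(s)$ is the limiting value of $\pi_r(s, \gamma^\circ, \Lambda_0^\circ) \hat p(s-)/\mathcal{Y}(s, \Lambda_0^\circ)$ and $\mu_{kr}$ is defined as

$$\mu_{kr} = \int_0^\tau \alpha_r(s) \sum_{l=1}^{m_k} dM_{kl}(s).$$

Arguments in Yang and Prentice (1999, Appendix A) can be used to show that $\hat p(s-)$ has
a limit. Also, clearly $E[\mu_{kr}] = 0$.
We thus have

$$U_r(\gamma^\circ, \Lambda_0^\circ) + [U_r(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) - U_r(\gamma^\circ, \Lambda_0^\circ)] \approx \frac{1}{n} \sum_{i=1}^{n} (\xi_{ir} + \mu_{ir}),$$

which is a mean of $n$ iid random variables. Hence $n^{1/2}\{U_r(\gamma^\circ, \Lambda_0^\circ) + [U_r(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) -
U_r(\gamma^\circ, \Lambda_0^\circ)]\}$ is asymptotically normally distributed. The covariance matrix may be
estimated by $\hat V(\hat\gamma) + \hat G(\hat\gamma) + \hat C(\hat\gamma)$, where

$$\hat C_{rl}(\hat\gamma) = \frac{1}{n} \sum_{i=1}^{n} (\hat\xi_{ir} \hat\mu_{il} + \hat\xi_{il} \hat\mu_{ir}), \qquad r, l = 1, \ldots, p+1,$$

with

$$\hat\mu_{ir} = \int_0^\tau \frac{\pi_r(s, \hat\gamma, \hat\Lambda_0)\, \hat p(s-)}{\mathcal{Y}(s, \hat\Lambda_0)} \sum_{j=1}^{m_i} d\hat M_{ij}(s)$$

and

$$\hat M_{ij}(t) = N_{ij}(t) - \int_0^t \exp(\hat\beta^T Z_{ij}) Y_{ij}(u) \psi_i(\hat\gamma, \hat\Lambda_0, u-)\, d\hat\Lambda_0(u).$$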

J 



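Step IV

First order Taylor expansion of $U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma))$ about $\gamma^\circ = (\beta^{\circ T}, \theta^\circ)^T$ gives

$$U(\hat\gamma, \hat\Lambda_0(\cdot, \hat\gamma)) = U(\gamma^\circ, \hat\Lambda_0(\cdot, \gamma^\circ)) + D(\gamma^\circ)(\hat\gamma - \gamma^\circ)^T + o_p(1),$$

where

$$D_{ls}(\gamma) = \partial U_l(\gamma, \hat\Lambda_0(\cdot, \gamma))/\partial\gamma_s$$

for $l, s = 1, \ldots, p+1$, with $\gamma_{p+1} = \theta$.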
For I, s = 1, . . . ,p we have 



Dish) = 



-n 



-i^f 02i(7,Ao,r) g^ dH,,{T,j) 
03i(7.Ao.r) (;6|(7.Ao.r) 



,0u(7,Ao,t) (/)fi(7,Ao,T), 



and 



dPs 



OPs 



dPs 



exp(/3^Zi,) + Ao{Tij A r^) exp(/3^Zi,)Z, 



^ 0h(7, Ao,7-jfc-l) 

0^,(7. Aq. (;63,(7- Ao- a-i) 1 (97^,. (^-1) 



+ 



Ul0ii(7,Ao,Tfc_i) 0H(7,Ao,Tfe_i) J 



Ri.{Tk) 



01i(7) A-o,Tjk_i) j=i 



For / = 1, . . . , p we have 

n -1 V f 02^(7, Ao,r) ^ , aiy,,(T,,) 



^ 1 0h(7, Ao,t) ^-=1
9^



Q2?(7- A().r) _ 02^(7- -)pi-- (7. Ap, r) 
0h(7,Ao,t) 0?i(7,Ao,T) 



.0ii(7,Ao,r) 0ij(7,Ao,r) 



9^ 



5Z Hij{Tij)Ziji 
J i=i 



and 

^(p+i)/(7) 
Finally, 



1=1 



g(7,Ao,T)02i(7,Ao,T) 
0h(7,Ao,t) 



4?(7,Ao,r) I dH^Xr) 



(/)H(7,Ao,r) I 



^(p+i)(p+i)(7) = ^ 2^ 



+ 



i=l 



0S? (7, Ao, r)02.(7, Ao, r) 0^^^ (7, Aq, r) 



0h(7! Ao, 



^i?(7,Ao,r) 
0ij(7, Ao) 



where 



rH7,Ao,r) 



^2.(7, Ao,t) 



w^'(")exp{-wi/i.(^)} 



0u(7,Ao,t) 



dHij{Tk) _ dAo{T,jAn) 



89 



86 



and 



8AAo{Tk) 
89 



-4 Ij2 ^^'^^'^^'^^-^^ 



Ri.{rk) 



i=l 01i(7, Ao, Tfc.i) ; 

02? (7, Ao, T-fc-i) 02i(7, Ao, rfc_i)0S? (7, Ao, Tk-i) 



i=l 



(7,Ao,rfc_i; 



(7,Ao,rfc_i) 



8Hi,{Tk-i) f02i(7, Ao,rfc_i) 03i(7, Ao,rfc_i) 



9^ 



5'H(7,Ao,rfe_i) 



!'ii(7, Ao,Tjt_i^ 



Step V 



(28) 



(29) 



(30) 



Combining the results above we get that $n^{1/2}(\hat\gamma - \gamma^\circ)$ is asymptotically zero-mean
normally distributed with a covariance matrix that can be consistently estimated by

$$\hat D^{-1}(\hat\gamma)\{\hat V(\hat\gamma) + \hat G(\hat\gamma) + \hat C(\hat\gamma)\}\hat D^{-1}(\hat\gamma)^T.$$

4.5 Proof of (19)

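The goal is to prove that

$$\sup_{s,\gamma} |A(\gamma, \tilde\Lambda_0^{(n)}, s) - a(\gamma, \tilde\Lambda_0^{(n)}, s)| \to 0 \quad \text{a.s.} \tag{31}$$

This involves several steps.

First, it is easy to see that there exists a constant $\kappa$ (independent of $\gamma$ and $s$) such
that

$$\sup_{s,\gamma} |A(\gamma, \Lambda_1, s) - A(\gamma, \Lambda_2, s)| \le \kappa \|\Lambda_1 - \Lambda_2\|, \tag{32}$$

$$\sup_{s,\gamma} |a(\gamma, \Lambda_1, s) - a(\gamma, \Lambda_2, s)| \le \kappa \|\Lambda_1 - \Lambda_2\|. \tag{33}$$

Next, for any fixed continuous $\Lambda$, the functional strong law of large numbers of Andersen
& Gill (1982, Appendix III) implies that, with probability one,

$$\sup_{s,\gamma} |A(\gamma, \Lambda, s) - a(\gamma, \Lambda, s)| \to 0. \tag{34}$$

Now, given $\epsilon > 0$, define the sets $\{t_i^{(\epsilon)}\}$, $\{\gamma_j^{(\epsilon)}\}$, and $\{\Lambda_k^{(\epsilon)}\}$ to be finite partition
grids of $[0, \tau]$, $\mathcal{G}$, and $[0, \Lambda_{\max}]$, respectively, with distance of no more than $\epsilon$ between grid
points. Define $\mathcal{L}^*_\epsilon$ to be the set of functions of $t$ and $\gamma$ defined by linear interpolation
through vertices of the form $(t_i^{(\epsilon)}, \gamma_j^{(\epsilon)}, \Lambda_k^{(\epsilon)})$.

Obviously $\mathcal{L}^*_\epsilon$ is a finite set. Hence, in view of (34), there exists a probability-one set
of realizations $\Omega_\epsilon$ for which

$$\sup_{s \in [0,\tau],\, \gamma \in \mathcal{G},\, \Lambda \in \mathcal{L}^*_\epsilon} |A(\gamma, \Lambda, s) - a(\gamma, \Lambda, s)| \to 0. \tag{35}$$

Define

$$\Omega^{**} = \bigcap_{\ell=1}^{\infty} \Omega_{1/\ell}$$

and $\Omega_0 = \Omega^* \cap \Omega^{**}$, with $\Omega^*$ as defined earlier. Clearly $\Pr(\Omega_0) = 1$. From now on, we
restrict attention to $\Omega_0$.

Now let $\epsilon > 0$ be given. Choose $\ell > \epsilon^{-1}$. In view of (17) and (35), we can find for
any $\omega \in \Omega_0$ a suitable positive integer $n(\epsilon, \omega)$ such that, whenever $n \ge n(\epsilon, \omega)$,

$$|\tilde\Lambda_0^{(n)}(t, \gamma) - \tilde\Lambda_0^{(n)}(u, \gamma)| \le B^*(t - u) + \frac{\epsilon}{2} \quad \forall\, t, u, \tag{36}$$

$$\sup_{s \in [0,\tau],\, \gamma \in \mathcal{G},\, \Lambda \in \mathcal{L}^*_{1/\ell}} |A(\gamma, \Lambda, s) - a(\gamma, \Lambda, s)| \le \epsilon. \tag{37}$$

Next, let $\bar\Lambda_0^{(n)}$ denote the function defined by linear interpolation through $(t_i^{(1/\ell)}, \gamma_j^{(1/\ell)}, \Lambda_{ij}^{(1/\ell)})$,
where $\Lambda_{ij}^{(1/\ell)}$ is the element of $\{\Lambda_k^{(1/\ell)}\}$ that is closest to $\tilde\Lambda_0^{(n)}(t_i^{(1/\ell)}, \gamma_j^{(1/\ell)})$. It is clear that

$$|\bar\Lambda_0^{(n)}(t_i^{(1/\ell)}, \gamma_j^{(1/\ell)}) - \tilde\Lambda_0^{(n)}(t_i^{(1/\ell)}, \gamma_j^{(1/\ell)})| \le \epsilon.$$

Using (36) and the Lipschitz continuity of $\tilde\Lambda_0^{(n)}(t, \gamma)$ with respect to $\gamma$ (which follows from
the corresponding property of $\hat\Lambda_0(t, \gamma)$), we thus obtain

$$\sup_{t,\gamma} |\tilde\Lambda_0^{(n)}(t, \gamma) - \bar\Lambda_0^{(n)}(t, \gamma)| \le B^{**} \epsilon$$

for a suitable fixed constant $B^{**}$ (depending on $B^*$ and $C^*$). Combining this with (32)
and (33), we obtain

$$\sup_{s,\gamma} |A(\gamma, \tilde\Lambda_0^{(n)}, s) - a(\gamma, \tilde\Lambda_0^{(n)}, s)| \le (2\kappa B^{**} + 1)\epsilon \quad \text{for all } n \ge n(\epsilon, \omega).$$

Since $\epsilon$ was arbitrary, the desired conclusion (31) follows, and the proof is thus complete.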



4.6 Definition and behavior of $h_i(r, s)$

The quantity $h_i(r, s)$ appearing in (24) is given by



{y{s,A°, + rA)}^fr{ 
where $\Delta(T_{ij} \wedge s) = \hat\Lambda_0(T_{ij} \wedge s) - \Lambda_0^\circ(T_{ij} \wedge s)$ and



(7°,A° + rA,s)l' 04.(7°, AS + ^A,s) ^02.(7°, A° + rA, s)03.(7°, Ag + rA, s) 



.ii(7°,Ag + rA,s)J 0ii(7°,Ag + rA,s) {0i,(7°, Ag + rA, s)}

For all $i = 1, \ldots, n$ and $s \in [0, \tau]$, we have $0 \le R_{i.}(s) \le m\nu$, where $\nu$ is as in (6).
Moreover, for $k = 1, \ldots, 4$, we have

where $r_{\max} = \arg\max_{1 \le r \le m} E(W^r)$, $r_{\min} = \arg\min_{1 \le r \le m} E(W^r)$. Hence, $\eta_{1i}$ and $\eta_{2i}$ are
bounded. In addition, the proof of Lemma 2 shows that $\mathcal{Y}(s, \Lambda_0^\circ + r\Delta)$ is uniformly
bounded away from zero for $n$ sufficiently large. Finally, in the consistency proof we
obtained $\|\Delta\| = o(1)$. Therefore $h_i(r, s)$ is $o(1)$ uniformly in $r$ and $s$.